Genotypic resistance determined by whole genome sequencing versus phenotypic resistance in 234 Escherichia coli isolates

Whole genome sequencing (WGS) enables detailed characterization of bacteria at single nucleotide resolution. It provides data about acquired resistance genes and mutations leading to resistance. Although WGS is becoming an essential tool to predict resistance patterns accurately, comparing genotype to phenotype with WGS is still in its infancy. Additional data and validation are needed. In this retrospective study, we analysed 234 E. coli isolates from positive blood cultures using WGS as well as microdilution for 11 clinically relevant antibiotics, to compare the two techniques. We performed whole genome sequencing analyses on 234 blood culture isolates (genotype) to detect acquired antibiotic resistance. Minimal inhibitory concentrations (MIC) for E. coli were determined for amoxicillin, cefepime, cefotaxime, ceftazidime, meropenem, amoxicillin/clavulanic acid, piperacillin/tazobactam, amikacin, gentamicin, tobramycin, and ciprofloxacin, using the ISO 20776-1 standard broth microdilution method as recommended by EUCAST (phenotype). We then compared the two methods for statistical ‘agreement’. A perfect (100%) categorical agreement between genotype and phenotype was observed for gentamicin and meropenem. However, no resistance to meropenem was observed. A high categorical agreement (> 95%) was observed for amoxicillin, cefepime, cefotaxime, ceftazidime, amikacin, and tobramycin. A categorical agreement lower than 95% was observed for amoxicillin/clavulanic acid, piperacillin/tazobactam, and ciprofloxacin. Most discrepancies occurred in isolates with MICs within ± 1 doubling dilution of the breakpoint, and 22.73% of the major errors were samples that tested phenotypically susceptible at higher antibiotic exposure and were therefore considered ‘not resistant’. This study shows that WGS can be used as a valuable tool to predict phenotypic resistance against most of the clinically relevant antibiotics used for the treatment of E. coli bloodstream infections.

www.nature.com/scientificreports/
to reduced target affinity, alterations of regulatory networks that control the expression of resistance-regulatory proteins, and reduced access to the bacterial membrane. E. coli has a great capacity to accumulate resistance genes through horizontal gene transfer. Plasmids and other mobile genetic elements, such as gene cassettes in class one and class two integrons and transposons, play a major role in the horizontal gene transfer of resistance genes 5,6. Genomic analysis, and more specifically whole genome sequencing (WGS), enables detailed characterization of the bacterium at single nucleotide resolution. It provides data on the serotype, pathotype, sequence type (ST), acquired antibiotic resistance, and virulence factors. WGS is therefore a very powerful tool for routine surveillance and outbreak investigation 7. Although WGS could become an essential tool to predict resistance patterns, clinical laboratories still rely on dilution and diffusion susceptibility testing to guide clinical therapy. Bringing a sequencing-based approach into routine use would be very costly and requires robust bioinformatics tools and experienced personnel. In addition, genotyping antibiotic resistance is still in its infancy, so additional data and validation are eagerly awaited 8.
In our previous research paper, we performed WGS and broth dilution for amoxicillin/clavulanic acid on E. coli blood isolates 9. We described the acquired beta-lactamase genes and highlighted the low level of agreement between EUCAST and CLSI methodologies when performing minimal inhibitory concentration (MIC) testing of amoxicillin/clavulanic acid.
In the current research paper, we performed broth dilution for 10 additional antibiotics and analysed the acquired resistance genes and the known chromosomal point mutations leading to resistance in the same isolates using BioNumerics v.8.1 (Applied Maths, Biomérieux, Belgium), a software package for functional genotyping based on the most recent ResFinder and PointFinder databases (http://www.genomicepidemiology.org/services/) combined with private knowledge. In this study, we describe the acquired resistance mechanisms present in 234 E. coli isolates. We then assessed whether genotypic resistance matched the phenotypic resistance obtained through broth dilution for 11 antibiotics that are clinically relevant for treating E. coli bloodstream infections.
Phenotypic ESBL-detection. The EUCAST disk diffusion method for phenotypic detection of extended-spectrum beta-lactamases (ESBL) was performed on Mueller-Hinton agar (I2A, Montpellier, France) with ceftazidime, ceftriaxone, cefepime, and clavulanic acid. The Mueller-Hinton agar plates were incubated for 24 h at 37 °C. After incubation, the zone of inhibition was measured by SIRscan® (I2A, Montpellier, France). The EUCAST algorithm was used to interpret the disk diffusion diameters 11.
DNA isolation and whole genome sequencing. Two different methods were used to perform WGS of the E. coli isolates. For 30 samples, genomic DNA was extracted using the DNeasy Blood & Tissue kit (Qiagen, Hilden, Germany), and DNA libraries were prepared with the KAPA Hyper Plus kit (Kapa Biosystems, Wilmington, MA, USA). These libraries were sequenced on a MiSeq instrument (Illumina, San Diego, CA, USA) using the v2 (2 × 250 bp) and v3 (2 × 300 bp) reagent kits. For the remaining 203 samples, genomic DNA was extracted using the Maxwell RSC Cell DNA purification kit (Promega Corporation, Madison, USA). Fragmentation of 500 ng of genomic DNA was carried out using the NEBNext® Ultra™ II FS module. Sequencing libraries with an average insert size of 550 bp were prepared using the KAPA Hyper Plus kit (Kapa Biosystems, Wilmington, USA), with size selection on a Pippin Prep (Sage Science, Beverly, MA, USA) using the CDF1510 1.5% agarose dye-free cassette. To avoid PCR bias, the PCR amplification step was omitted and a 500 ng input of genomic DNA was used. After equimolar pooling, libraries were sequenced on a NovaSeq 6000 instrument (Illumina, San Diego, CA, USA) using the NovaSeq 6000 SP Reagent Kit (500 cycles), generating 2 × 250 bp reads. For sequencing, the library was denatured and diluted according to the manufacturer's instructions. A 1% PhiX control library was included in each sequencing run. Sequence quality was assessed with FastQC (version 0.11.4) software (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/).
De novo assembly was performed using the SPAdes genome assembler in BioNumerics. The quality of the sequence read sets and the de novo assemblies was verified using the quality assessment tool available in BioNumerics.
The sequence data were analysed for acquired and mutational resistance and serotypes using the E. coli genotyping tool available in BioNumerics v.8.1 (Applied Maths, Biomérieux, Belgium). The presence of resistance genes was determined with a minimum percentage sequence identity (ID) threshold of 95% and a minimum length for sequence coverage of 95%. This genotyping tool is based on publicly available databases on the Center
We confirm that all methods were carried out in accordance with relevant guidelines and regulations. The Ethics Committee of UZ Brussels/VUB decided no informed consent was needed from the subjects.
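The identity and coverage thresholds described above for resistance gene detection can be illustrated with a minimal Python sketch. This is not the BioNumerics implementation; the `GeneHit` structure, the function name, and the identity and coverage values of the example hits are hypothetical and serve only to show how a 95%/95% filter behaves.

```python
from dataclasses import dataclass


@dataclass
class GeneHit:
    """One candidate resistance gene match (hypothetical structure)."""
    gene: str
    identity_pct: float   # percentage sequence identity to the reference gene
    coverage_pct: float   # percentage of the reference length that is covered


def passes_thresholds(hit: GeneHit,
                      min_identity: float = 95.0,
                      min_coverage: float = 95.0) -> bool:
    """Keep a hit only if it meets both the identity and coverage thresholds."""
    return hit.identity_pct >= min_identity and hit.coverage_pct >= min_coverage


# Hypothetical hits for illustration only; values are not data from this study.
hits = [
    GeneHit("blaCTX-M-15", identity_pct=100.0, coverage_pct=100.0),
    GeneHit("blaTEM-1B", identity_pct=99.8, coverage_pct=90.0),   # coverage too low
    GeneHit("aac(3)-IId", identity_pct=94.0, coverage_pct=100.0), # identity too low
]
accepted = [h.gene for h in hits if passes_thresholds(h)]
```

In this sketch, a hit is reported only when both criteria are met, so a truncated or divergent gene copy is not counted as an acquired resistance gene.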

Results
Resistance. In total, 57 different acquired resistance genes and 18 mutations associated with known resistance phenotypes for aminoglycosides, beta-lactams, trimethoprim, quinolones, phenicols, tetracyclines, sulphonamides, and macrolides are summarized in Figs. 1
Comparing genotype with phenotype. We found an average categorical agreement of 94.13% between our genotypic tests and the reference phenotypic tests. A perfect (100%) categorical agreement between genotypes and phenotypes was observed for gentamicin and meropenem. A high categorical agreement (> 95%) was observed for amoxicillin, cefepime, cefotaxime, ceftazidime, amikacin, and tobramycin. A categorical agreement lower than 95% was observed for amoxicillin/clavulanic acid, piperacillin/tazobactam, and ciprofloxacin.

[Figure axis label: Amount of times identified]
Major errors were observed for all antibiotics except amoxicillin/clavulanic acid and gentamicin. However, 15/66 (22.73%) of the major errors involved isolates that tested phenotypically 'susceptible, increased exposure' and were therefore considered not resistant. Very major errors were observed only for amoxicillin, amoxicillin/clavulanic acid, and piperacillin/tazobactam.
A perfect (100%) sensitivity was found for all antibiotics except amoxicillin, amoxicillin/clavulanic acid, piperacillin/tazobactam, and tobramycin. No sensitivity could be calculated for amikacin due to the lack of phenotypically resistant isolates in our set. High specificity (> 95%) was found for all antibiotics except ciprofloxacin. Notably, 14.89% of the strains phenotypically susceptible to ciprofloxacin harbored a single mutation or acquired gene associated with resistance to ciprofloxacin. All 45 resistant strains harbored more than one mutation or acquired resistance gene associated with resistance to ciprofloxacin.
The positive predictive value was found to be very diverse in our dataset: 99.24% for amoxicillin, 36.36% for cefepime, 80.00% for cefotaxime, 60.00% for ceftazidime, 100% for amoxicillin/clavulanic acid, 50.00% for piperacillin/tazobactam, 100% for gentamicin, 75.00% for tobramycin and 64.29% for ciprofloxacin. No positive predictive values could be calculated for amikacin due to the lack of phenotypic-resistant isolates in our dataset. A perfect (100%) negative predictive value was found for all antibiotics with the exclusion of amoxicillin/clavulanic acid, piperacillin/tazobactam, and ciprofloxacin.
All the above-mentioned results are summarized and visualized in detail in Fig. 3 and Supplementary Data 4.
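For readers who wish to recompute the agreement statistics reported above, the following minimal Python sketch shows one way to derive them from a genotype-versus-phenotype 2 × 2 table. It assumes that a major error is a genotypically resistant but phenotypically susceptible isolate and that a very major error is the reverse; the class and function names are hypothetical, and the example counts are invented for illustration and are not data from this study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Confusion:
    """Genotype-versus-phenotype counts for one antibiotic."""
    tp: int  # genotype resistant, phenotype resistant
    fp: int  # genotype resistant, phenotype susceptible (major error)
    fn: int  # genotype susceptible, phenotype resistant (very major error)
    tn: int  # genotype susceptible, phenotype susceptible


def pct(numerator: int, denominator: int) -> Optional[float]:
    """Percentage rounded to two decimals, or None when it is undefined."""
    if denominator == 0:
        return None
    return round(100 * numerator / denominator, 2)


def agreement_metrics(c: Confusion) -> dict:
    """Categorical agreement, error counts, and predictive statistics."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "categorical_agreement": pct(c.tp + c.tn, total),
        "major_errors": c.fp,
        "very_major_errors": c.fn,
        "sensitivity": pct(c.tp, c.tp + c.fn),
        "specificity": pct(c.tn, c.tn + c.fp),
        "ppv": pct(c.tp, c.tp + c.fp),
        "npv": pct(c.tn, c.tn + c.fn),
    }


# Hypothetical counts for illustration only; not data from this study.
example = agreement_metrics(Confusion(tp=40, fp=10, fn=5, tn=145))
```

When no phenotypically resistant isolates are present, the sensitivity denominator is zero and the sketch returns `None`, which mirrors the situation in which a statistic cannot be calculated.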

Extended-spectrum beta-lactamases.
Twelve of the 234 (5.13%) isolates were considered ESBL-positive according to the disk diffusion test. In 11/12 (91.67%) of these isolates, an ESBL gene was identified, with the blaCTX-M-15 gene being the most prevalent, found in eight of the 12 (66.67%) isolates. Two of the 12 (16.67%) isolates carried a blaCTX-M-1 gene and one of the 12 (8.33%) isolates carried a blaTEM-52C gene. In the one isolate that carried no ESBL gene but tested phenotypically positive for ESBL, we found a point mutation in the AmpC promoter known to induce resistance to ceftazidime, cefepime, and clavulanic acid, which possibly explains the ESBL phenotype. However, this mutation does not cause resistance to ceftriaxone. All the isolates carrying an ESBL gene tested phenotypically positive for ESBL. If disk diffusion is considered the gold standard for ESBL testing, a categorical agreement was found in 11/12 (91.67%) of our samples. We

Discussion
In this research paper, we described the utility of WGS for predicting antibiotic resistance against 11 clinically relevant antibiotics to treat E. coli bloodstream infections. We identified 57 different acquired resistance genes and 18 resistance-associated mutations with known phenotypes. We found an average categorical agreement of 94.13% between our genotypic tests and the reference phenotypic tests. A perfect (100%) categorical agreement between genotypes and phenotypes was observed for gentamicin and meropenem. A high categorical agreement (> 95%) was observed for amoxicillin, cefepime, cefotaxime, ceftazidime, amikacin, and tobramycin. A categorical agreement lower than 95% was observed for amoxicillin/clavulanic acid, piperacillin/tazobactam, and ciprofloxacin. Most discrepancies occurred in isolates with MICs within ± 1 doubling dilution of the breakpoint. Of note, 22.73% of the major errors/discrepancies tested phenotypically as 'susceptible at higher exposure' and were therefore considered not resistant (Fig. 3). Bortolaia et al. compared the phenotypic resistance of 584 E. coli isolates based on MIC against 16 antibiotics with genotypic resistance based on ResFinder 4.0. They found an overall genotype-phenotype concordance of 97%, ranging from 71.6% for cefepime to 100% for most antibiotics. In our study, the average categorical agreement was lower. However, our study included amoxicillin/clavulanic acid and piperacillin/tazobactam, which theirs did not 8.
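The notion of an MIC lying within ± 1 doubling dilution of the breakpoint can be made concrete with a short Python sketch. It assumes a single breakpoint concentration per antibiotic, which is a simplification; the function names and the example breakpoint of 8 mg/L are hypothetical and are not taken from the EUCAST tables or from this study.

```python
import math


def dilution_steps_from_breakpoint(mic_mg_l: float, breakpoint_mg_l: float) -> float:
    """Signed distance between MIC and breakpoint in doubling dilutions (log2 steps)."""
    return math.log2(mic_mg_l / breakpoint_mg_l)


def is_near_breakpoint(mic_mg_l: float, breakpoint_mg_l: float,
                       max_steps: float = 1.0) -> bool:
    """True when the MIC is within max_steps doubling dilutions of the breakpoint."""
    return abs(dilution_steps_from_breakpoint(mic_mg_l, breakpoint_mg_l)) <= max_steps


# Hypothetical breakpoint of 8 mg/L, for illustration only.
near_above = is_near_breakpoint(16, 8)   # one dilution above the breakpoint
near_below = is_near_breakpoint(4, 8)    # one dilution below the breakpoint
far_above = is_near_breakpoint(32, 8)    # two dilutions above the breakpoint
```

Because broth microdilution uses twofold concentration steps, expressing the distance on a log2 scale gives the number of dilution steps directly.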
Tyson et al. already reported the utility of WGS for the accurate prediction of antibiotic resistance in E. coli. They described more than 30 acquired resistance genes among 76 E. coli isolates. The resultant resistance genotypes correlated with 99.6% sensitivity and 97.8% specificity to resistance phenotypes. Overall, our percentages of categorical agreement are lower than in the work of Tyson et al., possibly because we did not retest discrepant results as they did. We also did not select isolates based on multidrug-resistant profiles, as they did: in our study, we included isolates randomly without previous knowledge of their resistance profiles 12.
Although our results show a high degree of correlation between resistance genotypes and phenotypes for most antibiotics, this is not the case for amoxicillin/clavulanic acid. This is in line with the results of Davies et al., who point out that amoxicillin/clavulanic acid resistance in E. coli is quantitative rather than qualitative and that resistance is built up from many different features, resulting in suboptimal concordance when using binary classifications such as phenotypic categories 13. The underlying mechanisms resulting in a resistant phenotype are complex and variable, and factors other than the mere presence of acquired resistance genes or mutations very likely play an important role, for example induced AmpC expression 14. Before genotypic resistance testing of beta-lactamase inhibitor combinations, such as amoxicillin/clavulanic acid and, to a lesser extent, piperacillin/tazobactam, can become reliable, more effort should be made to reveal the factors determining the expression levels of the beta-lactamases.
In the case of ciprofloxacin, as many as 13.23% of strains harboring resistance genes or resistance-associated mutations were phenotypically susceptible. These strains harbored only one such determinant, whereas all strains classified as resistant harbored at least two. This was expected, as resistance to fluoroquinolones is known to require the accumulation of multiple acquired resistance genes or mutations, including those that increase drug efflux 15. In comparison with dilution and diffusion methods, it is also important to note that genotypic resistance is expressed as either present or absent and does not measure clinical resistance thresholds.
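The observation above, that susceptible strains carried a single determinant whereas resistant strains carried at least two, suggests a simple counting rule, sketched below in Python. This rule is an illustration of that observation only; it was not the genotypic classification used in this study and has not been validated. The function name is hypothetical, and the determinant names are common examples chosen for illustration rather than results from our dataset.

```python
def predict_ciprofloxacin(determinants: set, min_determinants: int = 2) -> str:
    """Return 'R' only when at least min_determinants resistance determinants are present."""
    return "R" if len(determinants) >= min_determinants else "S"


# Illustrative determinant names; not taken from the results of this study.
single = predict_ciprofloxacin({"gyrA S83L"})
multiple = predict_ciprofloxacin({"gyrA S83L", "gyrA D87N", "parC S80I"})
```

A presence/absence call would label both example strains as resistant, whereas the counting rule separates them, which is the kind of refinement that could reduce major errors for fluoroquinolones.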
Eleven of the 12 (91.67%) strains with an ESBL-positive phenotype were identified as ESBL-positive with WGS. We observed the co-carriage of blaOXA-1 in ESBL-positive strains, highlighting a link between ESBL-producing bacteria and resistance to amoxicillin/clavulanic acid and piperacillin/tazobactam. Concordant with Livermore et al., the ESBL accompanying OXA-1 was always CTX-M-15 16.
Antibiotic resistance is a growing problem. Rapid and correct selection of an appropriate antibiotic is of great importance in the clinic, especially in severe and invasive infections such as bloodstream infections. Although the application of WGS to the prediction of phenotypic resistance profiles is fairly well known, the data available on E. coli blood isolates are limited. Therefore, additional data like those provided in our study are much needed. In this study performed on randomly selected blood isolates, we demonstrated that WGS could accurately predict the vast majority of resistance phenotypes against the antibiotics we tested. Most discrepant results were observed for beta-lactamase inhibitor combinations, mainly amoxicillin/clavulanic acid, suggesting that high levels of beta-lactamase production are involved in low-level resistance. More effort should be made to better understand the factors determining the expression levels of these enzymes.
A limitation of this study is the low resistance rates to some antibiotics such as meropenem, which was therefore excluded from parts of the analysis. It is important to take into account that this study only focuses on unimicrobial bacterial cultures of E. coli as the lead bacterial pathogen in the frame of multimicrobial infections leading to E. coli bacteremia. These multispecies communities might critically influence the antibiotic resistance gene expression of the lead pathogen, often resulting in a lack of accuracy of antimicrobial susceptibility testing of one micro-organism to predict the in vivo success or failure of antibiotic therapy 17.
There is an undeniable potential for WGS-based techniques to replace dilution and diffusion susceptibility testing to guide clinical therapy. Currently, phenotypic susceptibility testing is still much cheaper and faster than performing WGS. However, this will probably change over time since WGS is getting cheaper. WGS also provides a massive amount of additional information that classic dilution and diffusion susceptibility tests do not provide, such as virulence factors and the relatedness of bacterial isolates 18. This additional information could be of great value, for example, to investigate the bacterial origin.

Conclusion
In conclusion, this study shows that WGS can be used as a valuable tool to predict phenotypic resistance to most of the clinically relevant antibiotics used to treat E. coli bloodstream infections, with high specificity and sensitivity. However, excessive beta-lactamase expression, exceeding the activity of inhibitors, leads to a lower accuracy of genotypic tests to detect resistance to beta-lactam combinations. Before applying WGS in a clinical context, the genetic basis of those resistance mechanisms should be unraveled.

Data availability
The datasets generated and/or analysed during the current study are available in the National Library of Medicine repository. BioProject: PRJNA854358.